Univariate TS Models (ARIMA/SARIMA)

ACF & PACF Plots

Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots are crucial tools in time series analysis, helping to identify the type of model that best describes a series. The ACF plot shows the correlation of the series with its own lags, providing insights into the overall correlation structure and potential seasonality. On the other hand, the PACF plot reveals the direct effect of past values on the current value, helping to pinpoint the order of autoregressive models.

By examining the ACF and PACF plots, we can discern patterns that suggest the presence of autoregressive (AR) or moving average (MA) components in our time series models. A gradual decay in the ACF combined with a sharp cutoff in the PACF points to AR terms, with the lag of the PACF cutoff suggesting the order p; conversely, a sharp cutoff in the ACF with a gradual decay in the PACF points to MA terms, with the ACF cutoff suggesting the order q. These plots also assist in determining the stationarity of the series, a crucial aspect of time series modeling, where non-stationary data often require differencing to achieve stationarity.

In this section, we’ll explore the ACF and PACF plots for our datasets, delving into their autocorrelation structures and deriving insights that will inform our model selection and forecasting approach.
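The plots themselves were generated in R. As a minimal sketch of what they compute (illustrative Python, not the report's code), the snippet below estimates the sample ACF and the PACF via the Durbin-Levinson recursion on a synthetic AR(1) series; for an AR(1) process the ACF decays geometrically while the PACF cuts off after lag 1, exactly the signatures discussed above.

```python
# Illustrative sketch: sample ACF, and PACF via the Durbin-Levinson
# recursion, computed on a synthetic AR(1) series (not the report's data).
import random

def acf(x, nlags):
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n / c0
        for k in range(nlags + 1)
    ]

def pacf(x, nlags):
    # Durbin-Levinson: the PACF at lag k is the last AR coefficient phi_kk
    r = acf(x, nlags)
    phi = [[0.0] * (nlags + 1) for _ in range(nlags + 1)]
    out = [1.0]
    for k in range(1, nlags + 1):
        num = r[k] - sum(phi[k - 1][j] * r[k - j] for j in range(1, k))
        den = 1.0 - sum(phi[k - 1][j] * r[j] for j in range(1, k))
        phi[k][k] = num / den
        for j in range(1, k):
            phi[k][j] = phi[k - 1][j] - phi[k][k] * phi[k - 1][k - j]
        out.append(phi[k][k])
    return out

random.seed(42)
x = [0.0]
for _ in range(999):                     # AR(1): x_t = 0.8 * x_{t-1} + e_t
    x.append(0.8 * x[-1] + random.gauss(0, 1))

print(acf(x, 3))   # slow geometric decay, roughly 0.8, 0.64, 0.5
print(pacf(x, 3))  # sharp cutoff: lag 1 near 0.8, later lags near zero
```

With 1,000 observations the estimated ACF decays slowly while the PACF beyond lag 1 stays close to zero, which is how a gradual ACF decay plus a PACF cutoff identifies an AR order.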

The ACF plot for crude oil prices demonstrates prolonged significant autocorrelation, suggesting a non-stationary series. The gradual decline in correlation as lags increase indicates a potential long-term dependency or trend in the data.

The PACF plot shows significant spikes at lags 1 and 2, followed by non-significant values.

The ACF and PACF plots suggest considering an ARIMA model with ‘p’ of 1 or 2. The slow decay in the ACF implies that differencing (d > 0) may be necessary to achieve stationarity.

The ACF plot for natural gas prices shows a very slow decay, suggesting non-stationarity and a need for differencing.

The PACF plot shows a significant spike at lag 1, followed by a drop-off.

The ACF and PACF plots suggest considering an ARIMA model with ‘p’ of 1 or 2. The slow decay in the ACF implies that differencing (d > 0) may be necessary to achieve stationarity.

The ACF plot for electricity shows strong positive autocorrelation across all lags, indicating non-stationarity and a need for differencing; the slowly decaying pattern may also point to an MA component.

The PACF plot shows a significant correlation at lag 1. The choice of ‘p’ could be 1 or 2 based on the first significant spikes.

The consistent autocorrelation in the ACF plot suggests a potential need for a higher-order MA term or differencing, leading to an ARIMA(p,d,q) model consideration.

The ACF plot for GDP shows persistent, strong autocorrelation across all lags, indicating non-stationarity and suggesting that differencing may be necessary.

The PACF plot has a sharp cutoff after lag 1, indicating an AR(1) process and suggesting that the previous value has a significant impact on current GDP.

Given the strong autocorrelation and the PACF cutoff, an ARIMA(1,1,0) model, which already incorporates one order of differencing (d = 1), may be a good starting point for modeling GDP.

The ACF plot for CPI shows sustained high autocorrelation across lags, suggesting a non-stationary time series and reflecting CPI’s long memory.

The PACF plot has a sharp spike at lag 1 and a cutoff thereafter, indicating an AR(1) process and suggesting that the previous value has a significant impact on current CPI.

The sustained autocorrelation in the ACF plot implies that differencing might be needed. An initial ARIMA(1,1,0) model could be considered.

Detrend vs Difference

Detrending and differencing are two methods used to make time series data stationary. Detrending involves removing the underlying trend from the data, while differencing focuses on the changes between consecutive observations.

Detrending typically subtracts the estimated trend component from the original series, while differencing transforms the series into the sequence of differences between adjacent values. While detrending addresses the trend, differencing can help eliminate both trend and seasonality, making the series stationary.
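As a concrete sketch of the distinction (illustrative Python, not the report's R code), consider a series that is a pure linear trend: detrending reduces it to zeros, while first differencing reduces it to the constant slope.

```python
# Illustrative sketch: linear detrending vs first differencing on a
# series that is a pure linear trend (toy data).
def detrend(x):
    # fit a least-squares line, then subtract it from the series
    n = len(x)
    t = list(range(n))
    tbar, xbar = sum(t) / n, sum(x) / n
    slope = sum((ti - tbar) * (xi - xbar) for ti, xi in zip(t, x)) / sum(
        (ti - tbar) ** 2 for ti in t
    )
    intercept = xbar - slope * tbar
    return [xi - (intercept + slope * ti) for ti, xi in zip(t, x)]

def difference(x, lag=1):
    return [x[i] - x[i - lag] for i in range(lag, len(x))]

series = [2.0 * t + 5.0 for t in range(10)]   # pure linear trend
print(detrend(series))     # all near zero: the fitted line explains everything
print(difference(series))  # constant 2.0: the slope; the trend is removed
```

Note that differencing also shortens the series by one observation per pass, which is why the differenced plots below start one period later than the originals.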

Detrended (Crude Oil):

The residuals after detrending indicate that linear detrending might not fully account for all underlying components, as patterns still emerge.

First Difference (Crude Oil):

The differenced series shows fluctuations around zero, indicative of improved stationarity. While the mean appears stabilized, examining autocorrelation in the differenced data is crucial to confirm stationarity fully.

Detrended (Natural Gas):

The detrended series shows periods of volatility, indicating that removing the linear trend doesn’t capture all the data’s dynamics; the remaining fluctuations may reflect other underlying components such as seasonality.

First Difference (Natural Gas):

The differenced series has a consistent mean but varying volatility, with a spike around 2020 that could be due to the pandemic.

Detrended (Electricity):

The detrended plot of electricity prices displays clear periodic fluctuations, suggesting the presence of seasonality.

First Difference (Electricity):

The first-differenced series oscillates around a central mean, which is indicative of stationarity in the mean of the series. However, the consistent pattern of spikes followed by a return to the mean indicates a strong seasonal component.

Detrended (GDP):

The detrended GDP plot shows residuals with a clear non-linear component: they decline gradually, begin to rise after the 1980s, and accelerate significantly in recent years. The GDP growth rate is not constant, so a simple linear trend model may not be sufficient to capture these complexities.

First Difference (GDP):

The differenced series predominantly hovers around the zero line, indicating that this transformation effectively removes the trend and yields a series that is stationary in the mean. The substantial spike towards the end likely reflects the economic shock of the COVID-19 pandemic, which is not accounted for by typical GDP growth patterns.

Detrended (CPI):

The detrended CPI plot reveals residuals that decline over a prolonged period before stabilizing and then increasing. This pattern suggests that a simple linear trend does not fully capture the complexity of the inflationary trend over time.

First Difference (CPI):

The first difference plot for CPI demonstrates a series that fluctuates around a central mean value.

Original vs First Difference

Time series data often embody intrinsic trends and seasonality, which can confound analyses if not properly addressed. Two techniques for transforming such data into a stationary form are detrending and differencing.

Detrending involves the removal of a trend line from the time series, thus flattening the data into a horizontal line around the mean.

Differencing, on the other hand, involves computing the difference between consecutive observations. This method is effective in eliminating both trend and seasonality, transforming the series into one where the mean level does not change over time. Differencing is a critical step in preparing data for ARIMA (AutoRegressive Integrated Moving Average) modeling.


Dickey-Fuller Test

The Dickey-Fuller test is a statistical test used to determine the presence of a unit root in a time series and, consequently, whether the series is non-stationary. The null hypothesis of the test is that the series is non-stationary (has a unit root). If the p-value is less than the chosen alpha level (usually 0.05), we reject the null hypothesis and infer that the series is stationary.


    Augmented Dickey-Fuller Test

data:  composite_crude_oil_ts
Dickey-Fuller = -2.8866, Lag order = 8, p-value = 0.203
alternative hypothesis: stationary

Since the p-value is greater than 0.05, we do not reject the null hypothesis and conclude that the crude oil price series is non-stationary, indicating that differencing or transformation may be required to achieve stationarity.


    Augmented Dickey-Fuller Test

data:  citygate_gas_ts
Dickey-Fuller = -2.8807, Lag order = 7, p-value = 0.2055
alternative hypothesis: stationary

The p-value suggests that the natural gas price series is non-stationary, reinforcing the need for potential differencing or other transformations to stabilize the series.


    Augmented Dickey-Fuller Test

data:  total_electricity_ts
Dickey-Fuller = -2.2818, Lag order = 7, p-value = 0.459
alternative hypothesis: stationary

With the p-value well above 0.05, the electricity price series is considered non-stationary, indicating that adjustments are necessary to model the series accurately.

Warning in tseries::adf.test(gdp_ts): p-value greater than printed p-value

    Augmented Dickey-Fuller Test

data:  gdp_ts
Dickey-Fuller = 2.5163, Lag order = 6, p-value = 0.99
alternative hypothesis: stationary

The extremely high p-value indicates that the GDP series is strongly non-stationary, suggesting that it would benefit from differencing or other methods to achieve stationarity.


    Augmented Dickey-Fuller Test

data:  cpi_ts
Dickey-Fuller = -0.76835, Lag order = 9, p-value = 0.9644
alternative hypothesis: stationary

The CPI series is also non-stationary based on the high p-value, indicating that preprocessing steps are needed before further analysis or modeling.

These Dickey-Fuller test results across different datasets corroborate our earlier analyses and ACF & PACF plot interpretations, showing a common theme of non-stationarity in the series.
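The tests above were run with tseries::adf.test in R, which uses the augmented variant with automatic lag selection. To make the mechanics concrete, here is a simplified sketch (illustrative Python, not the report's code) of the basic no-constant Dickey-Fuller regression: regress Δy_t on y_{t-1} and examine the t-statistic of the slope, which is compared against Dickey-Fuller critical values (roughly -1.95 at the 5% level for this variant) rather than the Student-t distribution.

```python
# Illustrative sketch of the idea behind the Dickey-Fuller test
# (simplest no-constant, no-lags variant): regress dy_t on y_{t-1}.
# A strongly negative t-statistic is evidence against a unit root.
import math, random

def df_stat(y):
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    x = y[:-1]
    beta = sum(a * b for a, b in zip(x, dy)) / sum(a * a for a in x)
    resid = [d - beta * a for a, d in zip(x, dy)]
    s2 = sum(e * e for e in resid) / (len(dy) - 1)
    return beta / math.sqrt(s2 / sum(a * a for a in x))

random.seed(1)
walk, ar = [0.0], [0.0]
for _ in range(499):
    e = random.gauss(0, 1)
    walk.append(walk[-1] + e)     # random walk (unit root): t-stat near zero
    ar.append(0.5 * ar[-1] + e)   # stationary AR(1): large negative t-stat

print(df_stat(walk), df_stat(ar))
```

For the stationary series the slope estimate is near φ − 1 = −0.5 and the statistic is far below the critical value, while for the random walk it is not, mirroring the pattern in the test output above.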

First vs Second Differencing

In time series analysis, differencing is a technique used to stabilize the mean of a series and make it stationary. When trends and seasonality are present in a time series, they can affect the predictive models. Differencing helps to mitigate these influences by focusing on the changes in the data rather than the actual values.

Differencing operates under the principle of transformation. It is designed to remove specific types of patterns:

  • First Differencing: This method subtracts the previous observation from the current one. It is a powerful tool to eliminate trends and some types of seasonality in the data, providing a clearer view of the underlying cyclical components and irregularities.

  • Second Differencing: When first differencing is not enough to achieve stationarity, or when the time series exhibits a more complex pattern such as a trend within a trend, second differencing can be employed. This involves applying the differencing operation twice, which can further simplify the predictive structure by reducing more complex serial correlations.
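A small sketch (illustrative Python, toy data) makes the distinction concrete: first differencing turns a quadratic trend into a linear one, and a second pass reduces it to a constant.

```python
# Illustrative sketch: first differencing removes a linear trend;
# second differencing is needed to flatten a quadratic trend.
def difference(x):
    return [x[i] - x[i - 1] for i in range(1, len(x))]

quad = [0.5 * t * t for t in range(8)]  # quadratic trend
d1 = difference(quad)                   # still trending (now linear)
d2 = difference(d1)                     # constant: stationary in the mean
print(d1)  # [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
print(d2)  # [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```

Over-differencing has a cost, though: each pass shortens the series and can inflate the variance, so second differencing is used only when the first difference is still clearly non-stationary.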

AIC & BIC

In time series analysis, choosing the right model is paramount for accurate forecasting. Two of the most critical metrics for model selection are the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). Both criteria are grounded in information theory and provide a means to balance model fit with model complexity.

  • AIC is a tool for model selection that quantifies the trade-offs between model complexity (the number of parameters in the model) and the goodness of fit. AIC rewards models that achieve a high goodness of fit but penalizes those that become overly complex. A lower AIC value often indicates a preferable model.

  • BIC extends the logic of AIC by incorporating sample size into the penalty for complexity. This adjustment makes BIC more stringent with complex models when dealing with larger datasets. As with AIC, a lower BIC suggests a better model.
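The formulas behind these criteria can be sketched directly (illustrative Python; the log-likelihood, parameter count k, and sample size n below are made-up values, not from the report). The example shows how AIC and BIC can disagree, since BIC penalizes extra parameters more heavily once n is moderately large:

```python
# Illustrative sketch: AIC, BIC, and AICc from a model's maximized
# log-likelihood L, parameter count k, and sample size n (toy values).
import math

def aic(loglik, k):
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    return k * math.log(n) - 2 * loglik

def aicc(loglik, k, n):
    # small-sample correction to AIC
    return aic(loglik, k) + 2 * k * (k + 1) / (n - k - 1)

n = 200
# model A: better fit, more parameters; model B: worse fit, fewer parameters
print(aic(-510.0, 4), bic(-510.0, 4, n), aicc(-510.0, 4, n))
print(aic(-512.5, 2), bic(-512.5, 2, n), aicc(-512.5, 2, n))
```

Here AIC prefers the richer model A while BIC prefers the simpler model B, which is exactly the kind of trade-off visible in the candidate-model tables below.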

   p d q      AIC      BIC     AICc
24 4 1 3 3090.304 3129.861 3090.609
12 2 1 1 3095.071 3117.047 3095.172

   p d q      AIC     BIC     AICc
17 3 1 1 1031.985 1057.04 1032.162
 2 0 1 1 1037.54 1050.068 1037.591

   p d q       AIC       BIC      AICc
20 3 1 4 -412.6750 -374.8158 -412.3046
23 4 1 2 -412.1291 -378.4765 -411.8334

  p d q      AIC    BIC     AICc
4 1 1 1 4033.393 4048.3 4033.525

Fitting ARIMA

With order candidates identified from the ACF and PACF plots and the AIC/BIC comparisons above, we fit ARIMA models to each series, estimating the AR and MA coefficients on the differenced data and checking the significance of the fitted parameters.
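As a sketch of what the fitting step involves (simplified Python, not the report's R workflow), an ARIMA(1,1,0) can be estimated by differencing once and then regressing each differenced value on its predecessor, i.e. conditional least squares for the AR(1) coefficient:

```python
# Illustrative sketch: estimate an ARIMA(1,1,0) by conditional least
# squares on a synthetic series whose true phi is 0.6.
import random

def difference(y):
    return [y[i] - y[i - 1] for i in range(1, len(y))]

def fit_ar1(w):
    # conditional least squares for a zero-mean AR(1)
    num = sum(w[t - 1] * w[t] for t in range(1, len(w)))
    den = sum(w[t - 1] ** 2 for t in range(1, len(w)))
    return num / den

random.seed(7)
w = [0.0]
for _ in range(1999):
    w.append(0.6 * w[-1] + random.gauss(0, 1))  # stationary AR(1) increments
y = [0.0]
for wi in w:
    y.append(y[-1] + wi)                        # integrate once: ARIMA(1,1,0)

phi_hat = fit_ar1(difference(y))
print(round(phi_hat, 2))  # close to the true value 0.6
```

Real ARIMA fitting (as in R) uses full maximum likelihood and handles MA terms as well, but the differencing-then-regression structure is the same.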

Model Diagnostics

After fitting, we examine the residuals to confirm that the model has captured the structure in the data. A well-specified model leaves residuals that resemble white noise: the residual ACF should show no significant spikes, and a Ljung-Box test should fail to reject the null hypothesis of no remaining autocorrelation.

Auto.Arima()

The auto.arima() function from R’s forecast package automates order selection: it chooses the degree of differencing with unit-root tests and then searches over combinations of p and q (and seasonal orders, where applicable), keeping the model with the lowest information criterion (AICc by default). Its selections provide a useful cross-check against the orders we identified manually.
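auto.arima() itself is an R function; its core idea, trying candidate orders and keeping the one with the lowest information criterion, can be sketched in Python. This toy version (illustrative, far simpler than the real search) selects only the AR order p by AIC, using the Levinson-Durbin recursion to obtain each order's innovation variance:

```python
# Illustrative sketch of information-criterion order selection: pick the
# AR order p minimizing a Gaussian AIC ~ n*ln(innovation variance) + 2p.
import math, random

def ar_order_by_aic(x, pmax):
    n = len(x)
    mean = sum(x) / n
    c = [sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n
         for k in range(pmax + 1)]
    phi = [0.0] * (pmax + 1)
    v = c[0]                          # innovation variance at order 0
    best = (n * math.log(v), 0)
    for k in range(1, pmax + 1):      # Levinson-Durbin step to order k
        lam = (c[k] - sum(phi[j] * c[k - j] for j in range(1, k))) / v
        new = phi[:]
        new[k] = lam
        for j in range(1, k):
            new[j] = phi[j] - lam * phi[k - j]
        phi = new
        v *= (1 - lam * lam)
        best = min(best, (n * math.log(v) + 2 * k, k))
    return best[1]

random.seed(11)
x = [0.0, 0.0]
for _ in range(1998):                 # true model: AR(2)
    x.append(1.2 * x[-1] - 0.5 * x[-2] + random.gauss(0, 1))

print(ar_order_by_aic(x, 6))          # typically recovers an order of 2
```

The real auto.arima() additionally searches over d (via unit-root tests), q, and seasonal orders, and uses AICc rather than plain AIC, but the select-by-criterion logic is the same.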

Forecasting

With fitted models in hand, we generate out-of-sample forecasts for each series. Point forecasts come with prediction intervals that widen as the horizon grows, reflecting the increasing uncertainty of projecting further into the future.
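For a pure AR(1) component the forecast recursion is simple enough to sketch directly (illustrative Python, not the report's forecasts): each step multiplies the previous forecast by φ, so point forecasts decay geometrically toward the series mean (zero for a demeaned series).

```python
# Illustrative sketch: multi-step forecasts from a fitted AR(1),
# x_hat(t+h) = phi**h * x_t, decaying toward the series mean of zero.
def ar1_forecast(last_value, phi, horizon):
    out = []
    x = last_value
    for _ in range(horizon):
        x = phi * x        # each additional step multiplies by phi
        out.append(x)
    return out

print(ar1_forecast(10.0, 0.8, 5))  # geometric decay toward zero
```

For models with differencing (d > 0), the forecasts of the differenced series are cumulated back up, which is why ARIMA forecasts of trending series like GDP continue the trend rather than reverting to a constant.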

Comparing with Benchmark Methods

Finally, we compare the ARIMA forecasts against simple benchmarks such as the mean, naive (last observed value), and drift methods. If an ARIMA model cannot beat these baselines on accuracy metrics like RMSE or MAE over a held-out test set, its added complexity is not justified.
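Such a comparison can be sketched as follows (illustrative Python with toy numbers, not the report's data): build each benchmark forecast from the training window, then score all of them against the held-out test values with RMSE.

```python
# Illustrative sketch: naive, mean, and drift benchmark forecasts
# compared by RMSE on a toy held-out test set.
import math

def rmse(actual, pred):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))

train = [10.0, 11.0, 12.0, 13.0, 14.0]
test = [15.0, 16.0, 17.0]

naive = [train[-1]] * len(test)                     # repeat last observation
mean_fc = [sum(train) / len(train)] * len(test)     # historical mean
slope = (train[-1] - train[0]) / (len(train) - 1)   # drift: extend avg change
drift = [train[-1] + slope * (h + 1) for h in range(len(test))]

print(rmse(test, naive), rmse(test, mean_fc), rmse(test, drift))
```

Drift wins here only because the toy test set continues the training trend exactly; on real data, a fitted ARIMA model should beat all three benchmarks to justify its use.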